Plants having modified growth characteristics and a method for making the same

The present invention concerns a method for modifying growth characteristics of a plant. More 
specifically, the present invention concerns a method for modifying growth characteristics of a
plant by modifying expression of a seedyl nucleic acid and/or by modifying levels and/or 
activity of a seedyl protein in a plant. The present invention also concerns plants having 
modified growth characteristics and modified expression of a seedyl nucleic acid and/or 
modified levels and/or activity of a seedyl protein relative to corresponding wild type plants. 

The ever-increasing world population and the dwindling supply of arable land available for 
agriculture fuel research towards improving the efficiency of agriculture. Conventional means 
for crop and horticultural improvements utilise selective breeding techniques to identify plants 
having desirable characteristics. However, such selective breeding techniques have several
drawbacks, namely that these techniques are typically labour intensive and result in plants that
often contain heterogeneous genetic components that may not always result in the desirable 
trait being passed on from parent plants. Advances in molecular biology have allowed mankind 
to modify the germplasm of animals and plants. Genetic engineering of plants entails the 
isolation and manipulation of genetic material (typically in the form of DNA or RNA) and the
subsequent introduction of that genetic material into a plant. Such technology has the capacity
to deliver crops or plants having various improved economic, agronomic or horticultural traits.
A trait of particular economic interest is yield. Yield is normally defined as the measurable
produce of economic value from a crop. This may be defined in terms of quantity and/or 
quality. Crop yield may not only be increased by combating one or more stresses to which a
crop or plant is typically subjected, but may also be increased by modifying the inherent growth
characteristics of a plant. Yield is directly dependent on several growth characteristics, for 
example, growth rate, biomass production, plant architecture, number and size of organs (for
example, the number of branches, tillers, shoots, flowers), seed production and more. Root 
development and nutrient uptake may also be important factors in determining yield. 

The ability to modify one or more plant growth characteristics would have many applications in
areas such as crop enhancement, plant breeding, production of ornamental plants,
arboriculture, horticulture, forestry, production of algae or plants (for example for use as
bioreactors, for the production of substances such as pharmaceuticals, antibodies, or
vaccines, or for the bioconversion of organic waste or for use as fuel in the case of high-yielding
algae and plants).
It has now been found that modifying expression in a plant of a seedyl nucleic acid and/or
modifying the level and/or activity in a plant of a seedyl protein gives plants having modified 
growth characteristics relative to corresponding wild type plants. 

A seedyl protein is defined herein as being a protein comprising in the following order from
N-terminus to C-terminus:

(i) a motif having at least 80% sequence identity to the sequence represented by
SEQ ID NO 15; and

(ii) a motif having at least 80% sequence identity to the sequence represented by
SEQ ID NO 16; and

(iii) a motif having at least 80% sequence identity to the sequence represented by
SEQ ID NO 17, and which motif is a coiled coil motif; and

(iv) a motif having at least 80% sequence identity to the sequence represented by
SEQ ID NO 18.
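The ordered-motif definition above can be illustrated with a minimal sketch. It assumes ungapped window comparison as a stand-in for "sequence identity" (an alignment program could equally be used), and the four motif strings are hypothetical placeholders, since the sequences of SEQ ID NO 15 to 18 are not reproduced in this passage.

```python
from typing import List, Optional

# Hypothetical placeholder motifs; NOT the sequences of SEQ ID NO 15 to 18.
HYPOTHETICAL_MOTIFS = ["PAVHQVW", "LPKIR", "IDAEI", "GAKLV"]


def first_hit(protein: str, motif: str, start: int, min_identity: float) -> Optional[int]:
    """Return the first window start at or after `start` whose ungapped
    identity to `motif` is at least `min_identity`, or None if there is none."""
    for i in range(start, len(protein) - len(motif) + 1):
        window = protein[i:i + len(motif)]
        matches = sum(a == b for a, b in zip(window, motif))
        if matches / len(motif) >= min_identity:
            return i
    return None


def has_ordered_motifs(protein: str, motifs: List[str], min_identity: float = 0.8) -> bool:
    """Check that every motif occurs, in the given order from N-terminus to
    C-terminus and without overlap, at or above the identity threshold."""
    position = 0
    for motif in motifs:
        hit = first_hit(protein, motif, position, min_identity)
        if hit is None:
            return False
        position = hit + len(motif)  # the next motif must lie further C-terminal
    return True


if __name__ == "__main__":
    candidate = "MMPAVHQVWGGLPKIRSSIDAEITTGAKLV"
    print(has_ordered_motifs(candidate, HYPOTHETICAL_MOTIFS))
```

Taking the earliest qualifying window for each motif leaves the most room for the motifs that follow, so the greedy scan is sufficient for an order check of this kind. The coiled coil requirement on the third motif is a structural property and is not tested by this sketch.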

A seedyl nucleic acid/gene is defined herein as being a nucleic acid or gene encoding a 
seedyl protein. The terms "seedyl gene", "seedyl nucleic acid" and "nucleic acid encoding a 
seedyl protein" are used interchangeably herein. The term seedyl nucleic acid/gene, as 
defined herein, also encompasses a complement of the sequence and corresponding RNA, 
DNA, cDNA or genomic DNA. The seedyl nucleic acid may be synthesized in whole or in part
and it may be a double-stranded nucleic acid or a single-stranded nucleic acid. The term also 
encompasses variants due to the degeneracy of the genetic code and variants that are 
interrupted by one or more intervening sequences. 
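As an illustration of variants arising from the degeneracy of the genetic code, the following sketch shows two different coding sequences that translate to the same peptide. The codon table is a deliberately small excerpt of the standard genetic code, and the example sequences are invented for illustration; neither is taken from the sequence listing.

```python
# Excerpt of the standard genetic code (DNA codons), sufficient for the example.
CODON_TABLE = {
    "ATG": "M", "TGG": "W",
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",
    "AAA": "K", "AAG": "K",
    "TTA": "L", "TTG": "L", "CTT": "L", "CTC": "L", "CTA": "L", "CTG": "L",
}


def translate(dna: str) -> str:
    """Translate a coding sequence codon by codon using the excerpted table."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def encode_same_protein(dna_a: str, dna_b: str) -> bool:
    """True if both coding sequences translate to the same amino acid sequence."""
    return translate(dna_a) == translate(dna_b)


if __name__ == "__main__":
    # Invented sequences differing at three codons but encoding the same peptide.
    print(translate("ATGGCTAAACTG"), translate("ATGGCGAAGTTA"))
    print(encode_same_protein("ATGGCTAAACTG", "ATGGCGAAGTTA"))
```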

A seedyl nucleic acid/gene or a seedyl protein may be wild type, i.e. a native or endogenous
nucleic acid or protein. The nucleic acid may be derived from the same or another species, 
which nucleic acid is introduced as a transgene, for example by transformation. This transgene 
may be substantially changed from its native form in composition and/or genomic environment 
through deliberate human manipulation. The nucleic acid may thus be derived (either directly
or indirectly (if subsequently modified)) from any source provided that the nucleic acid, when
expressed in a plant, gives modified plant growth characteristics. The nucleic acid may be 
isolated from a microbial source, such as bacteria, yeast or fungi, or from a plant, algae, insect, 
or animal (including human) source. Preferably, the seedyl nucleic acid is isolated from a 
plant. The nucleic acid may be isolated from a dicotyledonous species, preferably from the
family Solanaceae, further preferably from Nicotiana. More preferably, the seedyl nucleic acid
encodes a seedyl protein as defined hereinabove. Most preferably, the seedyl nucleic acid is
as represented by SEQ ID NO: 1, or by a portion thereof, or by a nucleic acid capable of
hybridising with the sequence represented by SEQ ID NO: 1, or is a nucleic acid encoding an
amino acid sequence represented by SEQ ID NO: 2 or a homologue, derivative or active fragment
thereof, which homologue has in increasing order of preference at least 20%, 25%, 30%, 35%,
40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or 99% sequence
identity with the sequence represented by SEQ ID NO 2.



The present invention provides a method for modifying the growth characteristics of a plant, 
comprising modifying expression in a plant of a nucleic acid encoding a seedyl protein and/or 
modifying the level and/or activity in a plant of a seedyl protein, wherein said seedyl protein 
comprises in the following order from N-terminus to C-terminus:

(i) a motif having at least 80% sequence identity to the sequence represented by
SEQ ID NO 15; and

(ii) a motif having at least 80% sequence identity to the sequence represented by
SEQ ID NO 16; and

(iii) a motif having at least 80% sequence identity to the sequence represented by
SEQ ID NO 17 and which is a coiled coil motif; and

(iv) a motif having at least 80% sequence identity to the sequence represented by
SEQ ID NO 18,

wherein the growth characteristics are modified relative to the growth characteristics of
corresponding wild-type plants.

The present invention also provides a hitherto unknown seedyl protein, which seedyl protein 
comprises in the following order from N-terminus to C-terminus: 

(i) a motif having at least 80% sequence identity to the sequence represented by
SEQ ID NO 15; and

(ii) a motif having at least 80% sequence identity to the sequence represented by
SEQ ID NO 16; and

(iii) a motif having at least 80% sequence identity to the sequence represented by
SEQ ID NO 17 and which motif is a coiled coil motif; and

(iv) a motif having at least 80% sequence identity to the sequence represented by
SEQ ID NO 18,

with the proviso that the seedyl protein is not the Arabidopsis protein as deposited in Genbank 
under NCBI accession number AL161572 (SEQ ID NO 12). 



According to a particular embodiment, the motif according to SEQ ID NO: 15 is as represented
by: (P/X)X(V/L/H)(Q/H)(V/I)W(N/X)NA(A/P)(F/C)D, wherein X may be any amino acid and
wherein

(P/X) preferably is P or is A or T or Q or another amino acid
(V/L/H) preferably is V or L or H
(Q/H) is either Q or H
(V/I) is either V or is T or S or another amino acid
(A/P) is preferably A or is P
(F/C) is preferably F or is C.

According to a particular embodiment, the motif according to SEQ ID NO 17 is as represented
by: (I/V/A)(D/E)XE(I/M)XX(I^ (L/V/T/I)X(K/Q), where
X may be any amino acid and wherein:

(I/V/A) preferably is I or V or is A
(D/E) is either D or E
(I/M) preferably is I or is M
(I/V) preferably is I or is V
(E/Q) preferably is E or is Q
(I/X) preferably is I or is M or is V or any other amino acid
(S/X) preferably is S or is T or any other amino acid
(S/X) preferably is S or is T or L or I or A
(R/K) preferably is R or is K
(L/V/T/I) preferably is L or T or V or I
(K/Q) preferably is K or Q
and which motif is a coiled coil motif.

According to a particular embodiment, the motif according to SEQ ID NO 18 is as represented
by: LP(F^)I(R/X)(T/I)(M/X)(P/R)XX(D^ (L/R)(V/X)(G/A)K, where X may be any amino acid and
wherein

(R/K) is either R or K
(R/X) is preferably R or is S or K
(T/I) is preferably T or I
(M/X) is preferably M or L or A or V
(P/R) is either P or R
(D/X) is preferably D or is G or T or N
(E/G) is preferably E or is G
(S/T) is preferably S or is T
(P/L) is preferably P or is L
(C/X) is preferably C or is P or A
(A/X) is preferably A or is V or I
(A/I) is preferably A or is I
(D/E) is either D or E
(L/R) is preferably L or is R
(V/X) is preferably V or is Q or N or I
(G/A) is preferably G or is A.
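The degenerate motif notation used above maps naturally onto a regular expression. The sketch below converts a pattern written in that notation into a regex and searches a protein sequence with it. It uses the SEQ ID NO 15 pattern with its third position read as (V/L/H), as given in the legend; the function names and the test sequence are assumptions introduced for illustration. The conversion follows the notation literally, so any position containing X accepts every residue, and the "preferably" qualifiers of the legend are not modelled.

```python
import re
from typing import Optional

# SEQ ID NO 15 pattern as written in the text, third position per its legend.
SEQ15_PATTERN = "(P/X)X(V/L/H)(Q/H)(V/I)W(N/X)NA(A/P)(F/C)D"


def motif_to_regex(motif: str) -> str:
    """Convert notation such as (Q/H)X(V/I)W into a regex such as [QH].[VI]W."""
    parts = []
    for group, single in re.findall(r"\(([A-Z/]+)\)|([A-Z])", motif):
        if group:
            options = group.split("/")
            # A group that lists X accepts any amino acid at that position.
            parts.append("." if "X" in options else "[" + "".join(options) + "]")
        else:
            parts.append("." if single == "X" else single)
    return "".join(parts)


def find_motif(protein: str, motif: str) -> Optional[int]:
    """Return the 0-based start of the first match of the motif, or None."""
    match = re.search(motif_to_regex(motif), protein)
    return match.start() if match else None


if __name__ == "__main__":
    print(motif_to_regex(SEQ15_PATTERN))
    # Invented test sequence, not taken from the sequence listing.
    print(find_motif("MKPAVQVWSNAAFDLL", SEQ15_PATTERN))
```

An exact pattern match of this kind is stricter than the 80% identity criterion used elsewhere in the text; it is only meant to show how the notation reads.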

The present invention also provides a hitherto unknown isolated seedyl nucleic acid/gene 
selected from: 

(i) a nucleic acid represented by any one of SEQ ID NO: 1, 5 or 7 or the 
complement of any of the aforementioned; 

(ii) a nucleic acid encoding an amino acid sequence represented by SEQ ID NO: 2, 
4, 6, 8 or 10; 

(iii) a nucleic acid encoding a homologue, derivative or active fragment of (i) or (ii) 
above; 

(iv) a nucleic acid capable of hybridising with a nucleic acid of (i), (ii) or (iii) above; 

(v) a nucleic acid which is degenerate as a result of the genetic code from any one 
of the nucleic acids of (i) to (iv) above; 

(vi) a nucleic acid which is an allelic variant of any one of the nucleic acids of (i) to 
(v) above; 

(vii) a nucleic acid which is an alternative splice variant of any one of the nucleic 
acids of (i) to (vi); 

(viii) a nucleic acid encoding a protein which has in increasing order of preference at 
least 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 
34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 
47%, 48%, 49%, 50%, 60%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%
or 99% sequence identity to any one or more from the sequences defined in (i) 
to (vii) above; 

(ix) a portion of a nucleic acid according to any of (i) to (viii) above; 

wherein the nucleic acids of (i) to (ix) above encode a seedyl protein as defined hereinabove, 
and with the proviso that the isolated nucleic acid is not a rice cDNA as deposited under 
Genbank accession number AK063941 (SEQ ID NO 3), a Medicago BAC clone deposited as 
AC144618, AC139356, AC144482 or AC135566, the Arabidopsis cDNA deposited under
AL161572 (SEQ ID NO 11) or the Zea mays EST deposited under AY108162 (SEQ ID NO 9).

Modifying expression of a seedyl nucleic acid and/or modifying activity and/or levels of a 
seedyl protein may be effected by modifying expression of a gene and/or by modifying activity 
and/or levels of a gene product, namely a polypeptide, in specific cells or tissues. The term
"modifying" as used herein (in the context of modifying expression, activity and/or levels)
means increasing, decreasing or changing in time or place. The modified expression, activity 
and/or levels of a seedyl gene or protein are modified compared to expression, activity and/or 
levels of a seedyl gene or protein in corresponding wild-type plants. The modified gene 
expression may result from modified expression levels of an endogenous seedyl gene and/or 
may result from modified expression levels of a seedyl gene introduced into a plant. Similarly, 
levels and/or activity of a seedyl protein may be modified due to modified expression of an 
endogenous seedyl nucleic acid/gene and/or due to modified expression of a seedyl nucleic 
acid/gene introduced into a plant. Activity of a seedyl protein may be increased by increasing 
levels of the protein itself. Activity may also be increased without any increase in levels of a 
seedyl protein or even when there is a reduction in levels of a seedyl protein. This may occur 
when the intrinsic properties of the polypeptide are altered, for example, by making a mutant 
form that is more active than the wild type. Mutations may cause conformational changes in a 
protein, resulting in higher activity and/or levels of the protein. Modified expression of a
gene/nucleic acid and/or modified activity and/or levels of a gene product/protein may be
effected, for example, by introducing a genetic modification (preferably in the locus of a seedyl
gene). The locus of a gene as defined herein is taken to mean a genomic region which
includes the gene of interest and 10 kb upstream or downstream of the coding region.
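The locus definition can be made concrete with a short sketch that computes the window spanned by a gene plus 10 kb on either side of its coding region and tests whether a genomic position (for example a T-DNA insertion site) falls inside it. The coordinate convention (1-based, inclusive) and all names are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Tuple

FLANK_BP = 10_000  # 10 kb upstream or downstream, as in the locus definition


@dataclass(frozen=True)
class Gene:
    """Coding region on a chromosome, 1-based inclusive coordinates (assumed)."""
    coding_start: int
    coding_end: int


def locus_bounds(gene: Gene, flank: int = FLANK_BP) -> Tuple[int, int]:
    """Return (start, end) of the locus; the start is clamped at position 1."""
    return max(1, gene.coding_start - flank), gene.coding_end + flank


def in_locus(position: int, gene: Gene, flank: int = FLANK_BP) -> bool:
    """True if a genomic position lies within the gene locus."""
    start, end = locus_bounds(gene, flank)
    return start <= position <= end


if __name__ == "__main__":
    gene = Gene(coding_start=25_000, coding_end=27_500)  # invented coordinates
    print(locus_bounds(gene))
    print(in_locus(16_200, gene), in_locus(40_000, gene))
```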

The genetic modification may be introduced, for example, by any one (or more) of the following 
methods: T-DNA activation, TILLING, site-directed mutagenesis, homologous recombination or by
introducing and expressing in a plant a nucleic acid encoding a seedyl protein or a 
homologue, derivative or active fragment thereof. Following introduction of the genetic 
modification, there follows a step of selecting for increased expression and/or activity and/or 
levels of a seedyl protein, which increase in expression and/or activity and/or levels gives 
plants having modified growth characteristics. 

T-DNA activation tagging (Hayashi et al. Science (1992) 1350-1353) involves insertion of T-DNA,
usually containing a promoter (may also be a translation enhancer or an intron), in the
genomic region of the gene of interest or 10 kb upstream or downstream of the coding region of a
gene in a configuration such that the promoter directs expression of the targeted gene. 
Typically, regulation of expression of the targeted gene by its natural promoter is disrupted and 
the gene falls under the control of the newly introduced promoter. The promoter is typically 
embedded in a T-DNA. This T-DNA is randomly inserted into the plant genome, for example, 
through Agrobacterium infection and leads to overexpression of genes near to the inserted T-DNA.
The resulting transgenic plants show dominant phenotypes due to overexpression of
genes close to the introduced promoter. The promoter to be introduced may be any promoter
capable of directing expression of a gene in the desired organism, in this case a plant. For
example, constitutive, tissue-preferred, cell type-preferred and inducible promoters are all 
suitable for use in T-DNA activation. 

A genetic modification may also be introduced in the locus of a seedyl gene using the
technique of TILLING (Targeted Induced Local Lesions IN Genomes). This is a mutagenesis
technology useful to generate and/or identify, and to isolate mutagenised variants of a seedyl
nucleic acid. TILLING also allows selection of plants carrying such mutant variants. These
mutant variants may even exhibit higher seedyl activity than exhibited by the gene in its
natural form. TILLING combines high-density mutagenesis with high-throughput screening
methods. The steps typically followed in TILLING are: (a) EMS mutagenesis (Redei and
Koncz, 1992; Feldmann et al., 1994; Lightner and Caspar, 1998); (b) DNA preparation and
pooling of individuals; (c) PCR amplification of a region of interest; (d) denaturation and
annealing to allow formation of heteroduplexes; (e) DHPLC, where the presence of a
heteroduplex in a pool is detected as an extra peak in the chromatogram; (f) identification of
the mutant individual; and (g) sequencing of the mutant PCR product. Methods for TILLING 
are well known in the art (McCallum Nat Biotechnol. 2000 Apr; 18(4): 455-7, reviewed by 
Stemple 2004 (TILLING-a high-throughput harvest for functional genomics. Nat Rev Genet. 
2004 Feb; 5(2): 145-50.)). 

Site directed mutagenesis may be used to generate variants of seedyl nucleic acids or 
portions thereof. Several methods are available to achieve site directed mutagenesis; the 
most common being PCR based methods (Current Protocols in Molecular Biology. Wiley Eds.
http://www.4ulr.com/products/cxirrentprotocols/index.html). 

T-DNA activation, TILLING and site-directed mutagenesis are examples of technologies that
enable the generation of novel alleles and seedyl nucleic acid variants that are therefore 
useful in the methods of the invention. 

Homologous recombination allows introduction in a genome of a selected nucleic acid at a
defined selected position. Homologous recombination is a standard technology used routinely
in the biological sciences for lower organisms such as yeast or moss (e.g. Physcomitrella).
Methods for performing homologous recombination in plants have been described not only for
model plants (Offringa et al. Extrachromosomal homologous recombination and gene targeting
in plant cells after Agrobacterium-mediated transformation. EMBO J. 1990 Oct;
9(10):3077-84) but also for crop plants, for example rice (Terada R, Urawa H, Inagaki Y,
Tsugane K, Iida S. Efficient gene targeting by homologous recombination in rice. Nat
Biotechnol. 2002. Iida and Terada: A tale of two integrations, transgene and T-DNA: gene
targeting by homologous recombination in rice. Curr Opin Biotechnol. 2004 Apr; 15(2):132-8).
The nucleic acid to be targeted need not be targeted to the locus of a seedyl gene, but may
be introduced in, for example, regions of high expression. The nucleic acid to be targeted may
be an improved allele used to replace the endogenous gene or may be introduced in addition
to the endogenous gene. 



A preferred method for introducing a genetic modification is to introduce and express in a plant 
a seedyl nucleic acid/gene or a portion thereof, or sequences capable of hybridising with the 
seedyl nucleic acid/gene, which nucleic acid encodes a seedyl protein or a homologue,
derivative or active fragment thereof. In this case, the genetic modification need not be in the 
locus of a seedyl gene. The nucleic acid may be introduced into a plant by, for example, 
transformation. 



Accordingly, the present invention provides a method for modifying the growth characteristics
of a plant, comprising introducing and expressing in a plant a seedyl nucleic acid/gene or a 
portion thereof, or sequences capable of hybridising with the seedyl nucleic acid/gene, which 
nucleic acid encodes a seedyl protein or a homologue, derivative or active fragment thereof. 

Advantageously, the methods according to the invention may also be practised using variant
nucleic acids and variant amino acids of SEQ ID NO 1 or 2 respectively. The term seedyl 
nucleic acid or seedyl protein encompasses variant nucleic acids and variant amino acids. 
The variant nucleic acids encode seedyl proteins as defined hereinabove, i.e. those 
comprising in the following order from N-terminus to C-terminus: 

(i) a motif having at least 80% sequence identity to the sequence represented by SEQ
ID NO 15; and

(ii) a motif having at least 80% sequence identity to the sequence represented by SEQ
ID NO 16; and

(iii) a motif having at least 80% sequence identity to the sequence represented by SEQ
ID NO 17, and which motif is a coiled coil motif; and

(iv) a motif having at least 80% sequence identity to the sequence represented by SEQ
ID NO 18,

and variant seedyl proteins are those comprising in the following order from N-terminus to
C-terminus:

(i) a motif having at least 80% sequence identity to the sequence represented by SEQ
ID NO 15; and

(ii) a motif having at least 80% sequence identity to the sequence represented by SEQ
ID NO 16; and

(iii) a motif having at least 80% sequence identity to the sequence represented by SEQ
ID NO 17, and which motif is a coiled coil motif; and

(iv) a motif having at least 80% sequence identity to the sequence represented by SEQ
ID NO 18.

Suitable variant nucleic acid and amino acid sequences useful in practising the method 
according to the invention, include: 
(i) Portions of a seedyl nucleic acid/gene;

(ii) Sequences capable of hybridising with a seedyl nucleic acid/gene; 

(iii) Alternative splice variants of a seedyl nucleic acid/gene; 

(iv) Allelic variants of a seedyl nucleic acid/gene; 

(v) Homologues, derivatives and active fragments of a seedyl protein. 

An example of a variant seedyl nucleic acid is a portion of a seedyl nucleic acid. The 
methods according to the invention may advantageously be practised using functional portions 
of a seedyl nucleic acid. A portion refers to a piece of DNA derived or prepared from an 
original (larger) DNA molecule, which DNA portion, when introduced and expressed in a plant,
gives plants having modified growth characteristics and which portion encodes a seedyl
protein as defined hereinabove. The portion may comprise many genes, with or without 
additional control elements or may contain spacer sequences. The portion may be made by 
making one or more deletions and/or truncations to the nucleic acid. Techniques for 
introducing truncations and deletions into a nucleic acid are well known in the art. Portions
suitable for use in the methods according to the invention may readily be determined by
following the methods described in the Examples section by simply substituting the sequence 
used in the actual Example with the portion to be tested for functionality. 

An example of a further variant seedyl nucleic acid is a sequence that is capable of 
hybridising to a seedyl nucleic acid as defined hereinabove, for example to a seedyl nucleic
acid as represented by any one of SEQ ID NO 1, 3, 5, 7, 9 or 11. Such hybridising sequences 
are those encoding a seedyl protein as defined hereinabove. Hybridising sequences suitable 
for use in the methods according to the invention may readily be determined for example by 
following the methods described in the Examples section by simply substituting the sequence 
used in the actual Example with the hybridising sequence.



The term "hybridisation" as defined herein is a process wherein substantially homologous
complementary nucleotide sequences anneal to each other. The hybridisation process can 
occur entirely in solution, i.e. both complementary nucleic acids are in solution. Tools in 
molecular biology relying on such a process include the polymerase chain reaction (PCR; and 
all methods based thereon), subtractive hybridisation, random primer extension, nuclease S1
mapping, primer extension, reverse transcription, cDNA synthesis, differential display of RNAs, 
and DNA sequence determination. The hybridisation process can also occur with one of the 
complementary nucleic acids immobilised to a matrix such as magnetic beads, Sepharose 
beads or any other resin. Tools in molecular biology relying on such a process include the
isolation of poly (A+) mRNA. The hybridisation process can furthermore occur with one of the
complementary nucleic acids immobilised to a solid support such as a nitro-cellulose or nylon 
membrane or immobilised by e.g. photolithography to e.g. a siliceous glass support (the latter 
known as nucleic acid arrays or microarrays or as nucleic acid chips). Tools in molecular 
biology relying on such a process include RNA and DNA gel blot analysis, colony hybridisation,
plaque hybridisation, in situ hybridisation and microarray hybridisation. In order to allow
hybridisation to occur, the nucleic acid molecules are generally thermally or chemically 
denatured to melt a double strand into two single strands and/or to remove hairpins or other 
secondary structures from single stranded nucleic acids. The stringency of hybridisation is 
influenced by conditions such as temperature, salt concentration and hybridisation buffer
composition. High stringency conditions for hybridisation include high temperature and/or low
salt concentration (salts include NaCl and Na3-citrate) and/or the inclusion of formamide in the
hybridisation buffer and/or lowering the concentration of compounds such as SDS (detergent) 
in the hybridisation buffer and/or exclusion of compounds such as dextran sulphate or 
polyethylene glycol (promoting molecular crowding) from the hybridisation buffer. 

Conventional hybridisation conditions are described in, for example, Sambrook (2001)
Molecular Cloning: a laboratory manual, 3rd Edition Cold Spring Harbor Laboratory Press, 
CSH, New York, but the skilled craftsman will appreciate that numerous different hybridisation 
conditions can be designed in function of the known or the expected homology and/or length of 
the nucleic acid. Sufficiently low stringency hybridisation conditions are particularly preferred
(at least in the first instance) to isolate nucleic acids heterologous to the DNA sequences of the
invention defined supra. An example of low stringency conditions is 4-6x SSC / 0.1-0.5% w/v 
SDS at 37-45°C for 2-3 hours. Depending on the source and concentration of the nucleic acid 
involved in the hybridisation, alternative conditions of stringency may be employed, such as 
medium stringency conditions. Examples of medium stringency conditions include 1-4x SSC /
0.25% w/v SDS at ≥ 45°C for 2-3 hours. An example of high stringency conditions includes
0.1-1x SSC / 0.1% w/v SDS at 60°C for 1-3 hours. The skilled man will be aware of various 
parameters which may be altered during hybridisation and washing and which will either 



WO 2005/049646 



PCT/EP2004/053030



maintain or change the stringency conditions. The stringency conditions may start low and be 
progressively increased until there is provided a hybridising seedyl nucleic acid, as defined 
hereinabove. Elements contributing to heterology include allelism, degeneration of the genetic 
code and differences in preferred codon usage. 

Another example of a variant seedyl is an alternative splice variant of a seedyl nucleic 
acid/gene. The methods according to the present invention may also be practised using an 
alternative splice variant of a seedyl nucleic acid. The term "alternative splice variant" as used 
herein encompasses variants of a nucleic acid in which selected introns and/or exons have
been excised, replaced or added. Such splice variants may be found in nature or can be
manmade using techniques well known in the art. Preferably, the splice variant is a splice 
variant of a sequence represented by any of SEQ ID NO 1, 3, 5, 7, 9 or 11. Splice variants 
suitable for use in the methods according to the invention may readily be determined for 
example by following the methods described in the Examples section by simply substituting the
sequence used in the actual Example with the splice variant.

Another example of a variant seedyl is an allelic variant. Advantageously, the methods 
according to the present invention may also be practised using allelic variants of a seedyl 
nucleic acid, preferably an allelic variant of a seedyl nucleic acid sequence represented by
any of SEQ ID NO 1, 3, 5, 7, 9 or 11. Allelic variants exist in nature and encompassed within
the methods of the present invention is the use of these isolated natural alleles in the methods
according to the invention. Allelic variants encompass Single Nucleotide Polymorphisms
(SNPs), as well as Small Insertion/Deletion Polymorphisms (INDELs); the size of INDELs is
usually less than 100 bp. SNPs and INDELs form the largest set of sequence variants in
naturally occurring polymorphic strains of most organisms. Allelic variants suitable for use in
the methods according to the invention may readily be determined for example by following the 
methods described in the Examples section by simply substituting the sequence used in the 
actual Example with the allelic variant. 
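The distinction between the two classes of allelic variant can be sketched as a simple classifier over a reference allele and an alternative allele. The representation of alleles as strings, the function name and the "other" category are assumptions made for illustration; the 100 bp cut-off reflects the statement above that INDELs are usually smaller than 100 bp and is not a strict definition.

```python
def classify_variant(ref: str, alt: str, max_indel_bp: int = 100) -> str:
    """Classify an allelic difference as 'SNP', 'INDEL' or 'other'.

    ref and alt are the reference and alternative alleles at one site,
    written as nucleotide strings (assumed representation).
    """
    if len(ref) == 1 and len(alt) == 1 and ref != alt:
        return "SNP"  # single nucleotide substitution
    size = abs(len(ref) - len(alt))
    if 0 < size < max_indel_bp:
        return "INDEL"  # small insertion or deletion
    return "other"  # identical alleles, large rearrangements, etc.


if __name__ == "__main__":
    for ref, alt in [("A", "G"), ("A", "ATTG"), ("ACGT", "A")]:
        print(ref, alt, classify_variant(ref, alt))
```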

Examples of variant seedyl amino acids include homologues, derivatives and active fragments
of a seedyl protein, preferably of a seedyl protein as represented by any one of SEQ ID NO 
2, 4, 6, 8, 10 or 12. Homologues, derivatives and active fragments of a seedyl protein are 
those comprising in the following order from N-terminus to C-terminus: 

(i) a motif having at least 80% sequence identity to the sequence represented by SEQ
ID NO 15; and

(ii) a motif having at least 80% sequence identity to the sequence represented by SEQ
ID NO 16; and

(iii) a motif having at least 80% sequence identity to the sequence represented by SEQ
ID NO 17, and which motif is a coiled coil motif; and

(iv) a motif having at least 80% sequence identity to the sequence represented by SEQ
ID NO 18.

Preferred seedyl homologues, derivatives and active fragments have a coiled coil domain, 
preferably located in the N-terminal half of the protein, more preferably between amino acid
positions 25 and 250, more preferably between positions 50 and 150. A coiled coil domain typically
determines protein folding. 

"Homologues" of a seedyl protein encompass peptides, oligopeptides, polypeptides, proteins 
and enzymes having amino acid substitutions, deletions and/or insertions relative to the 
unchanged protein in question and having similar biological and functional activity as the 
unchanged protein from which they are derived. To produce such homologues, amino acids of 
the protein may be replaced by other amino acids having similar properties (such as similar
hydrophobicity, hydrophilicity, antigenicity, propensity to form or break α-helical structures or
β-sheet structures). Conservative substitution tables are well known in the art (see for example
Creighton (1984) Proteins. W.H. Freeman and Company). 

The homologues of a seedyl protein have a percentage identity to any one of SEQ ID NO 2, 4,
6, 8, 10 or 12 equal to a value lying between 20% and 99.99%, i.e. in increasing order of 
preference at least 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 
32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 
47%, 48%, 49%, or 50% sequence identity or similarity (functional identity) to the unchanged
protein, alternatively at least 60% sequence identity or similarity to an unchanged protein,
alternatively at least 70% sequence identity or similarity to an unchanged protein. Typically, the 
homologues have at least 75% or 80% sequence identity or similarity to an unchanged protein, 
preferably at least 85%, 86%, 87%, 88%, 89% sequence identity or similarity, further 
preferably at least 90%, 91%, 92%, 93%, 94% sequence identity or similarity to an unchanged
protein, most preferably at least 95%, 96%, 97%, 98% or 99% sequence identity or similarity to
an unchanged protein. The percentage identities are when comparing full-length sequences. 
Homologues suitable for use in the methods according to the invention may readily be 
determined for example by following the methods described in the Examples section by simply 
substituting the sequence used in the actual Example with the homologous sequence. 

Percentage identity may be calculated using an alignment program, such alignment programs
being well known in the art. For example, percentage identity may be calculated using the
program GAP, or needle (EMBOSS package) or stretcher (EMBOSS package) or the program
AlignX, a module of the Vector NTI Suite 5.5 software package, using the standard
parameters (for example gap penalty 5, gap opening penalty 15, gap extension penalty
6.6).
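
By way of illustration only, the calculation of a percentage identity from a global alignment may be sketched as follows. The sketch implements a simplified Needleman-Wunsch alignment with a linear gap penalty; the scoring values (match +1, mismatch -1, gap -2), the function names and the example peptides are assumptions chosen for the illustration and are not the parameters of GAP, needle, stretcher or AlignX.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Simplified Needleman-Wunsch global alignment (linear gap penalty).

    Returns the two aligned strings, with '-' marking gaps.
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the dynamic programming matrix.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            top.append(a[i - 1])
            bottom.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1
    return "".join(reversed(top)), "".join(reversed(bottom))


def percent_identity(a, b):
    """Identical aligned positions as a percentage of the alignment length."""
    x, y = global_align(a, b)
    matches = sum(1 for p, q in zip(x, y) if p == q and p != "-")
    return 100.0 * matches / len(x)


if __name__ == "__main__":
    # Hypothetical peptides, not taken from the sequence listing.
    print(percent_identity("MKTAYIAK", "MKTAIAK"))
```

Note that different programs normalise identity differently (for example over the alignment length or over the shorter sequence), so values obtained with this sketch need not coincide with those reported by the programs named above.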

Methods for the search and identification of seedyl homologues, or of DNA sequences encoding
a seedyl homologue, would be well within the realm of persons skilled in the art. Such
methods involve screening sequence databases with the sequences as provided by the
present invention in SEQ ID NO 1 and 2 or 3 to 10, preferably in a computer readable format of
the nucleic acids of the present invention. This sequence information is available for example
in public databases, which include but are not limited to Genbank
(http://www.ncbi.nlm.nih.gov/web/Genbank), the European Molecular Biology Laboratory
Nucleic acid Database (EMBL) (http://www.ebi.ac.uk/ebi-docs/embl-db.html) or versions thereof, or
the MIPS database (http://mips.gsf.de/). Different search algorithms and software for the
alignment and comparison of sequences are well known in the art. Such software includes
GAP, BESTFIT, BLAST, FASTA and TFASTA. GAP uses the algorithm of Needleman and
Wunsch (J. Mol. Biol. 48: 443-453, 1970) to find the alignment of two complete sequences that
maximises the number of matches and minimises the number of gaps. The BLAST algorithm
calculates percentage sequence identity and performs a statistical analysis of the similarity
between the two sequences. The suite of programs referred to as BLAST programs has 5
different implementations: three designed for nucleotide sequence queries (BLASTN, BLASTX,
and TBLASTX) and two designed for protein sequence queries (BLASTP and TBLASTN)
(Coulson, Trends in Biotechnology: 76-80, 1994; Birren et al., Genome Analysis, 1: 543, 1997).
The software for performing BLAST analysis is publicly available through the National Centre
for Biotechnology Information.
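
The choice among the five BLAST implementations depends on the type of the query and of the database searched. A minimal sketch of that choice is given below; the dictionary, the function name and the type labels are illustrative assumptions and do not form part of any BLAST distribution.

```python
# (query type, database type) -> BLAST program.
# "translated nucleotide" means the program translates a nucleotide
# database in all six reading frames before comparing.
BLAST_PROGRAMS = {
    ("nucleotide", "nucleotide"): "blastn",
    ("nucleotide", "protein"): "blastx",  # query is translated
    ("nucleotide", "translated nucleotide"): "tblastx",  # both are translated
    ("protein", "protein"): "blastp",
    ("protein", "translated nucleotide"): "tblastn",  # database is translated
}


def choose_blast_program(query_type, database_type):
    """Return the BLAST program matching the query and database types."""
    try:
        return BLAST_PROGRAMS[(query_type, database_type)]
    except KeyError:
        raise ValueError(
            "no BLAST program for query %r against database %r"
            % (query_type, database_type)
        )


if __name__ == "__main__":
    # A protein query searched against an EST (nucleotide) collection.
    print(choose_blast_program("protein", "translated nucleotide"))
```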

Homologues of SEQ ID NO 2 may be found in many different organisms. The closest
homologues are found in the plant kingdom. For example, seedyl proteins were isolated from
tobacco (SEQ ID NO 2), rice (SEQ ID NO 4), medicago (SEQ ID NO 6), sugar cane (SEQ ID
NO 8), maize (SEQ ID NO 10) and from Arabidopsis (SEQ ID NO 12). Furthermore, ESTs from
other organisms have been deposited in Genbank, for example an EST from Vitis vinifera
(accession number CA816066), from Pinus taeda (accession number BM903108), from
Saccharum sp. (accession numbers CA228193 and CA256020), from Citrus sinensis
(accession number CF833583), from Plumbago zeylanica (accession number CB817788), from Zea
mays (accession numbers CF637447, AW282224, CD058812, AY108162, CD059048,
CF041861, AW067243), from Triticum aestivum (CA727065, BJ264506, BJ259034), from
Hordeum vulgare (accession numbers BU997034, CA727065, CA031127, BQ762011), from
Brassica napus (CD817460), from Gossypium arboreum (BG446106, BM360339), from
Eschscholzia californica (CD478368), from Populus tremula (BU821376) and from Beta
vulgaris (BQ594009). As more genomes are sequenced, many more seedyl homologues will
be identified.

The identification of domains or motifs would also be well within the realm of a person skilled
in the art and involves, for example, a computer readable format of the nucleic acids of the
present invention, the use of alignment software programs and the use of publicly available
information on protein domains, conserved motifs and boxes. This protein domain information
is available in the PRODOM
(http://www.biochem.ucl.ac.uk/bsm/dbbrowser/jj/prodomsrchjj.html), PIR
(http://pir.georgetown.edu/) or pFAM (http://pfam.wustl.edu/) database. Sequence analysis
programs designed for motif searching may be used for identification of fragments, regions and
conserved domains as mentioned above. Preferred computer programs would include but are
not limited to MEME, SIGNALSCAN, and GENESCAN. A MEME algorithm (Version 3.0) can
be found in the GCG package, or on the Internet site http://www.sdsc.edu/MEME/meme.
SIGNALSCAN version 4.0 information is available on the Internet site
http://biosci.cbs.umn.edu/software/sigscan.html. GENESCAN can be found on the Internet site
http://gnomic.stanford.edu/GENESCANW.html.

Two special forms of homology, orthologous and paralogous, are evolutionary concepts used
to describe ancestral relationships of genes. The term "paralogous" relates to gene
duplications within the genome of a species. The term "orthologous" relates to homologous
genes in different organisms due to ancestral relationship and the formation of different
species. The term "homologue" as defined herein also encompasses paralogues and
orthologues.



Orthologues in, for example, monocot plant species may easily be found by performing a so-
called reciprocal blast search. This may be done by a first blast involving blasting the
sequence in question (for example, SEQ ID NO: 1 or SEQ ID NO: 2) against any sequence
database, such as the publicly available NCBI database which may be found at:
http://www.ncbi.nlm.nih.gov. If orthologues in rice were sought, the sequence in question
would be blasted against, for example, the 28,469 full-length cDNA clones from Oryza sativa
Nipponbare available at NCBI. BLASTN or TBLASTX may be used when starting from
nucleotides, or BLASTP or TBLASTN when starting from the protein, with standard default
values. The blast results may be filtered. The full-length sequences of either the filtered
results or the non-filtered results are then blasted back (second blast) against the sequences
of the organism from which the sequence in question is derived. The results of the first and
second blasts are then compared. An orthologue is found when the results of the second blast
give as hits with the highest similarity a seedyl nucleic acid or protein; if one of the organisms
is tobacco then a paralogue is found. In the case of large families, ClustalW may be used,
followed by a neighbour joining tree, to help visualize the clustering.
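
The comparison of the first and second blasts described above may be sketched as follows. The sketch does not run BLAST itself; it assumes that the hits of both searches have already been parsed into (identifier, score) pairs. The function names, the data layout and the sequence identifiers are hypothetical and serve only to illustrate the reciprocal logic.

```python
def best_hit(hits):
    """Return the identifier with the highest score from (id, score) pairs."""
    if not hits:
        return None
    return max(hits, key=lambda hit: hit[1])[0]


def reciprocal_best_hits(query_id, forward_hits, reverse_hits_by_subject):
    """Return first-blast subjects whose best second-blast hit is the query.

    forward_hits: (subject_id, score) pairs from the first blast.
    reverse_hits_by_subject: maps each subject_id to the (hit_id, score)
        pairs obtained when that subject is blasted back against the
        organism from which the query is derived.
    """
    candidates = []
    for subject_id, _score in forward_hits:
        back_hits = reverse_hits_by_subject.get(subject_id, [])
        if best_hit(back_hits) == query_id:
            candidates.append(subject_id)
    return candidates


if __name__ == "__main__":
    # Hypothetical identifiers and scores, for illustration only.
    forward = [("rice_cDNA_A", 310.0), ("rice_cDNA_B", 95.0)]
    reverse = {
        "rice_cDNA_A": [("tobacco_seedyl", 305.0), ("tobacco_other", 80.0)],
        "rice_cDNA_B": [("tobacco_other", 120.0), ("tobacco_seedyl", 60.0)],
    }
    print(reciprocal_best_hits("tobacco_seedyl", forward, reverse))
```

In this hypothetical example only the first rice clone returns the query as its best back-hit and would therefore be retained as a candidate orthologue.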

Example homologues of a seedyl protein according to SEQ ID NO: 2 include a seedyl protein 
as represented by SEQ ID NO 4 (rice), SEQ ID NO 8 (sugar cane) and SEQ ID NO 10 (maize), 
SEQ ID NO 6 (medicago) and SEQ ID NO 12 (Arabidopsis). The proteins represented by SEQ 
ID NO 8 (sugar cane) and SEQ ID NO 10 (maize) are only partial, but the corresponding full
length sequences of the proteins and encoding cDNA may easily be determined by a person 
skilled in the art using routine techniques, such as colony hybridization of a cDNA library or 
using PCR based on the use of specific primers combined with degenerate primers. 

Another variant of seedyl useful in the methods of the present invention is a derivative of
seedyl. The term "derivatives" refers to peptides, oligopeptides, polypeptides, proteins and
enzymes which may comprise substitutions, deletions or additions of naturally and non-
naturally occurring amino acid residues compared to the amino acid sequence of a naturally-
occurring form of the protein, for example, as presented in SEQ ID NO: 2. "Derivatives" of a
seedyl protein encompass peptides, oligopeptides, polypeptides, proteins and enzymes which
may comprise naturally occurring changed, glycosylated, acylated or non-naturally occurring
amino acid residues compared to the amino acid sequence of a naturally-occurring form of the
polypeptide. A derivative may also comprise one or more non-amino acid substituents
compared to the amino acid sequence from which it is derived, for example a reporter
molecule or other ligand, covalently or non-covalently bound to the amino acid sequence such
as, for example, a reporter molecule which is bound to facilitate its detection, and non-naturally
occurring amino acid residues relative to the amino acid sequence of a naturally-occurring
protein.

"Substitutional variants" of a protein are those in which at least one residue in an amino acid
sequence has been removed and a different residue inserted in its place. Amino acid
substitutions are typically of single residues, but may be clustered depending upon functional
constraints placed upon the polypeptide; insertions will usually be of the order of about 1 to 10
amino acid residues, and deletions will range from about 1 to 20 residues. Preferably, amino
acid substitutions comprise conservative amino acid substitutions.

"Insertional variants" of a protein are those in which one or more amino acid residues are
introduced into a predetermined site in a protein. Insertions can comprise amino-terminal
and/or carboxy-terminal fusions as well as intra-sequence insertions of single or multiple amino
acids. Generally, insertions within the amino acid sequence will be smaller than amino- or
carboxy-terminal fusions, of the order of about 1 to 10 residues. Examples of amino- or
carboxy-terminal fusion proteins or peptides include the binding domain or activation domain of
a transcriptional activator as used in the yeast two-hybrid system, phage coat proteins,
(histidine)6-tag, glutathione S-transferase-tag, protein A, maltose-binding protein, dihydrofolate
reductase, Tag-100 epitope, c-myc epitope, FLAG®-epitope, lacZ, CMP (calmodulin-binding
peptide), HA epitope, protein C epitope and VSV epitope.

"Deletion variants" of a protein are characterised by the removal of one or more amino acids 
from the protein. Amino acid variants of a protein may readily be made using peptide synthetic 
techniques well known in the art, such as solid phase peptide synthesis and the like, or by
recombinant DNA manipulations. Methods for the manipulation of DNA sequences to produce
substitution, insertion or deletion variants of a protein are well known in the art. For example,
techniques for making substitution mutations at predetermined sites in DNA are well known to
those skilled in the art and include M13 mutagenesis, T7-Gen in vitro mutagenesis (USB,
Cleveland, OH), QuickChange Site Directed mutagenesis (Stratagene, San Diego, CA), PCR-
mediated site-directed mutagenesis or other site-directed mutagenesis protocols.

Another variant of a seedyl protein/amino acid useful in the methods of the present invention 
is an active fragment of a seedyl protein. "Active fragments" of a seedyl protein encompass 
contiguous amino acid residues of a seedyl protein, which residues retain similar biological 
and/or functional activity to the naturally occurring protein. Useful fragments are those falling
within the definition of a seedyl protein as defined hereinabove. Preferably, the fragments start 
at one of the second or third or further internal methionine residues. These fragments originate 
from protein translation, starting at internal ATG codons. 
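
Candidate fragments starting at internal methionine residues may be enumerated as sketched below. This is a minimal illustration that assumes each fragment runs from an internal methionine to the carboxy-terminus; the function name, the min_length parameter and the example sequence are hypothetical, and the sketch does not test whether a fragment retains activity.

```python
def internal_met_fragments(protein, min_length=1):
    """List fragments running from each internal methionine to the C-terminus.

    The first residue is skipped because a fragment starting there would be
    the full-length protein. Fragments shorter than min_length are dropped.
    """
    fragments = []
    for index, residue in enumerate(protein):
        if index > 0 and residue == "M" and len(protein) - index >= min_length:
            fragments.append(protein[index:])
    return fragments


if __name__ == "__main__":
    # Hypothetical sequence, not taken from the sequence listing.
    print(internal_met_fragments("MASMKLMQ"))
```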

For determining the presence of conserved motifs, sequences are aligned using suitable
software, such as Align X or clustal X, for indication of the conserved residues (see for
example Figure 3). Software packages like MEME version 3.0 may also be used to determine
motifs in sequences. This software is available from UCSD, SDSC and NBCR at
http://meme.sdsc.edu/meme/. For the identification of a coiled coil domain, the software Coils
2.0 can be used. This software is available at
http://www.ch.embnet.org/software/COILS_form.html. The 'X' in the motifs represented by
SEQ ID NO 15, 16, 17 and 18 represents any amino acid.
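
A motif in which 'X' stands for any amino acid may be searched for in a sequence as sketched below, by translating the motif into a regular expression. The motif "LXXK" and the test sequence are hypothetical placeholders and are not the motifs of SEQ ID NO 15 to 18; the function names are likewise assumptions made for the illustration.

```python
import re


def motif_to_regex(motif):
    """Translate a motif into a regex pattern, with 'X' matching any residue."""
    parts = []
    for char in motif.upper():
        # 'X' is the wildcard; every other letter must match literally.
        parts.append("[A-Z]" if char == "X" else re.escape(char))
    return "".join(parts)


def find_motif(motif, sequence):
    """Return the 0-based start positions of all motif occurrences.

    A lookahead is used so that overlapping occurrences are also reported.
    """
    pattern = re.compile("(?=(" + motif_to_regex(motif) + "))")
    return [match.start() for match in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    print(find_motif("LXXK", "MALAAKQLEEKR"))
```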
According to a preferred aspect of the present invention, enhanced or increased expression of
a seedyl nucleic acid in a plant or plant part is envisaged. Methods for obtaining increased
expression of genes or gene products are well documented in the art and include, for example,
overexpression driven by a (strong) promoter, the use of transcription enhancers or translation
enhancers. The term overexpression as used herein means any form of expression that is
additional to the original wild-type expression level. Preferably the seedyl nucleic acid is in the
sense direction with respect to the promoter to which it is operably linked. Alternatively,
selection of better performing alleles of the wild-type seedyl nucleic acid can be achieved via
plant breeding techniques.

The expression of a seedyl gene may be investigated by Northern or Southern blot analysis of 
cell extracts. The levels of a seedyl protein in cells may be investigated using Western blot
analysis of cell extracts. 

According to a further embodiment of the present invention, genetic constructs and vectors to
facilitate introduction and/or expression of the nucleotide sequences useful in the methods
according to the invention are provided. Therefore, the present invention provides a genetic
construct comprising:

(i) a seedyl nucleic acid encoding a seedyl protein as defined hereinabove;

(ii) one or more control sequences capable of regulating expression of the nucleic
acid of (i); and optionally

(iii) a transcription termination sequence.

According to methods of the present invention, such a genetic construct is introduced into a 
plant or plant part. 

Constructs useful in the methods according to the present invention may be constructed using
recombinant DNA technology well known to persons skilled in the art. The gene constructs
may be inserted into vectors, which may be commercially available, suitable for transforming
into plants and suitable for expression of the gene of interest in the transformed cells.

The genetic construct may be an expression vector wherein said nucleic acid is operably
linked to one or more control sequences allowing expression in prokaryotic and/or eukaryotic
host cells.

The nucleic acid according to (i) may be any seedyl nucleic acid as defined hereinabove,
preferably a seedyl nucleic acid as represented by any one of SEQ ID NO 1, 3, 5, 7, 9 or 11.
The control sequence of (ii) is preferably a seed-preferred promoter, for example a prolamin
promoter.

Plants are transformed with a vector comprising the sequence of interest, which sequence is 
operably linked to one or more control sequences (at least a promoter). The terms "regulatory
element" and "control sequence" are used interchangeably herein and are to be taken in a
broad context to refer to regulatory nucleic acids capable of effecting expression of the 
sequences to which they are ligated (i.e. operably linked). Encompassed by the 
aforementioned terms are promoters. A "Promoter" encompasses transcriptional regulatory 
sequences derived from a classical eukaryotic genomic gene (including the TATA box which is 
required for accurate transcription initiation, with or without a CCAAT box sequence) and 
additional regulatory elements (i.e. upstream activating sequences, enhancers and silencers) 
which alter gene expression in response to developmental and/or external stimuli, or in a 
tissue-specific manner. Also included within the term is a transcriptional regulatory sequence 
of a classical prokaryotic gene, in which case it may include a -35 box sequence and/or -10 
box transcriptional regulatory sequences. The term "regulatory element" also encompasses a 
synthetic fusion molecule or derivative which confers, activates or enhances expression of a 
nucleic acid molecule in a cell, tissue or organ. The term "operably linked" as used herein
refers to a functional linkage between the promoter sequence and the gene of interest, such 
that the promoter sequence is able to initiate transcription of the gene of interest. 

Advantageously, any type of promoter may be used to drive expression of the seedyl nucleic
acid. Preferably, the nucleic acid capable of modifying expression of a seedyl gene is
operably linked to a plant-derived promoter, preferably a plant-derived tissue-preferred
promoter. The term "tissue-preferred" as defined herein refers to a promoter that is expressed
predominantly in at least one tissue or organ. Preferably, the tissue-preferred promoter is a
seed-preferred promoter or a seed-specific promoter, further preferably an endosperm-
preferred promoter, more preferably a promoter isolated from a gene encoding a seed-storage
protein, most preferably a promoter isolated from a prolamin gene, such as a rice prolamin
promoter as represented by SEQ ID NO 14 or a promoter of similar strength and/or a promoter
with a similar expression pattern as the rice prolamin promoter. Similar strength and/or similar
expression pattern may be analysed, for example, by coupling the promoters to a reporter
gene and checking the function of the reporter gene in tissues of the plant. One well-known
reporter gene is beta-glucuronidase, with the colorimetric GUS stain used to visualize beta-
glucuronidase activity in plant tissue.

Examples of preferred seed-specific promoters and other tissue-specific promoters are 
presented in Table A, which promoters or derivatives thereof are useful in performing the
methods of the present invention. 



TABLE A

EXAMPLES OF SEED-PREFERRED PROMOTERS FOR USE IN THE PRESENT INVENTION

GENE SOURCE | EXPRESSION PATTERN | REFERENCE
seed-specific genes | seed | Simon et al., Plant Mol. Biol. 5: 191, 1985; Scofield et al., J. Biol. Chem. 262: 12202, 1987; Baszczynski et al., Plant Mol. Biol. 14: 633, 1990.
Brazil Nut albumin | seed | Pearson et al., Plant Mol. Biol. 18: 235-245, 1992.
legumin | seed | Ellis et al., Plant Mol. Biol. 10: 203-214, 1988.
glutelin (rice) | seed | Takaiwa et al., Mol. Gen. Genet. 208: 15-22, 1986; Takaiwa et al., FEBS Letts. 221: 43-47, 1987.
zein | seed | Matzke et al., Plant Mol. Biol. 14(3): 323-32, 1990
napA | seed | Stalberg et al., Planta 199: 515-519, 1996.
wheat LMW and HMW glutenin-1 | endosperm | Mol Gen Genet 216: 81-90, 1989; NAR 17: 461-2, 1989
wheat SPA | seed | Albani et al., Plant Cell 9: 171-184, 1997
wheat α, β, γ-gliadins | endosperm | EMBO 3: 1409-15, 1984
barley Itr1 promoter | endosperm |
barley B1, C, D hordein | endosperm | Theor Appl Gen 98: 1253-62, 1999; Plant J 4: 343-55, 1993; Mol Gen Genet 250: 750-60, 1996
barley DOF | endosperm | Mena et al., The Plant Journal 116(1): 53-62, 1998
blz2 | endosperm | EP99106056.7
synthetic promoter | endosperm | Vicente-Carbajosa et al., Plant J. 13: 629-640, 1998.
rice prolamin NRP33 | endosperm | Wu et al., Plant Cell Physiology 39(8) 885-889, 1998
rice α-globulin Glb-1 | endosperm | Wu et al., Plant Cell Physiology 39(8) 885-889, 1998
rice OSH1 | embryo | Sato et al., Proc. Natl. Acad. Sci. USA 93: 8117-8122, 1996
rice α-globulin REB/OHP-1 | endosperm | Nakase et al., Plant Mol. Biol. 33: 513-522, 1997
rice ADP-glucose PP | endosperm | Trans Res 6: 157-68, 1997
maize ESR gene family | endosperm | Plant J 12: 235-46, 1997
sorghum γ-kafirin | endosperm | PMB 32: 1029-35, 1996
KNOX | embryo | Postma-Haarsma et al., Plant Mol. Biol. 39: 257-71, 1999
rice oleosin | embryo and aleuron | Wu et al., J. Biochem. 123: 386, 1998
sunflower oleosin | seed (embryo and dry seed) | Cummins et al., Plant Mol. Biol. 19: 873-876, 1992
Metallothionein Mte, PRO0001 | transfer layer of embryo + calli |
putative beta-amylase, PRO0005 | transfer layer of embryo |
putative cellulose synthase, PRO0009 | weak in roots |
lipase (putative), PRO0012 | |
transferase (putative), PRO0014 | |
peptidyl prolyl cis-trans isomerase (putative), PRO0016 | |
Unknown, PRO0019 | |
prp protein (putative), PRO0020 | |
noduline (putative), PRO0029 | |
proteinase inhibitor Rgpi9, PRO0058 | seed |
beta expansine EXPB9, PRO0061 | weak in young flowers |
structural protein, PRO0063 | young tissues + calli + embryo |
xylosidase (putative), PRO0069 | |
prolamine 10 kDa, PRO0075 | strong in endosperm |
allergen RA2, PRO0076 | strong in endosperm |
prolamine RP7, PRO0077 | strong in endosperm |
CBP80, PRO0078 | |
starch branching enzyme I, PRO0079 | |
Metallothioneine-like ML2, PRO0080 | transfer layer of embryo + calli |
putative caffeoyl-CoA 3-O-methyltransferase, PRO0081 | shoot |
prolamine RM9, PRO0087 | strong in endosperm |
prolamine RP6, PRO0090 | strong endosperm |
prolamine RP5, PRO0091 | strong in endosperm |
allergen RA5, PRO0092 | |
putative methionine aminopeptidase, PRO0095 | embryo |
ras-related GTP binding protein, PRO0098 | |
beta expansine EXPB1, PRO0104 | |
Glycine rich protein, PRO0105 | |
metallothionein like protein (putative), PRO0108 | |
metallothioneine (putative), PRO0109 | |
RCc3, PRO0110 | strong root |
uclacyanin 3-like protein, PRO0111 | weak discrimination center / shoot meristem |
26S proteasome regulatory particle non-ATPase subunit 11, PRO0116 | very weak meristem specific |
putative 40S ribosomal protein, PRO0117 | weak in endosperm |
chlorophyll a/b-binding protein precursor (Cab27), PRO0122 | very weak in shoot |
putative protochlorophyllide reductase, PRO0123 | strong leaves |
metallothionein RiCMT, PRO0126 | strong discrimination center / shoot meristem |
GOS2, PRO0129 | strong constitutive |
GOS9, PRO0131 | |
chitinase Cht-3, PRO0133 | very weak meristem specific |
alpha-globulin, PRO0135 | strong in endosperm |
alanine aminotransferase, PRO0136 | weak in endosperm |
cyclin A2, PRO0138 | |
Cyclin D2, PRO0139 | |
Cyclin D3, PRO0140 | |
cyclophyllin 2, PRO0141 | shoot and seed |
sucrose synthase SS1 (barley), PRO0146 | medium constitutive |
trypsin inhibitor ITR1 (barley), PRO0147 | weak in endosperm |
ubiquitine 2 with intron, PRO0149 | strong constitutive |
WSI18, PRO0151 | embryo + stress |
HVA22 homologue (putative), PRO0156 | |
EL2, PRO0157 | |
Aquaporine, PRO0169 | medium constitutive in young plants |
High mobility group protein, PRO0170 | strong constitutive |
reversibly glycosylated protein RGP1, PRO0171 | weak constitutive |
cytosolic MDH, PRO0173 | shoot |
RAB21, PRO0175 | embryo + stress |
CDPK7, PRO0176 | |
Cdc2-1, PRO0177 | very weak in meristem |
sucrose synthase 3, PRO0197 | |
OsVP1, PRO0198 | |
OSH1, PRO0200 | very weak in young plant meristem |
putative chlorophyllase, PRO0208 | |
OsNRT1, PRO0210 | |
EXP3, PRO0211 | |
phosphate transporter OjFT1, PRO0216 | |
oleosin 18kd, PRO0218 | aleurone + embryo |
ubiquitine 2 without intron, PRO0219 | |
RFL, PRO0220 | |
maize UBI delta intron, PRO0221 | |
glutelin-1, PRO0223 | |
fragment of prolamin RP6 promoter, PRO0224 | |
4xABRE, PRO0225 | |
glutelin OSGLUA3, PRO0226 | |
BLZ-2_short (barley), PRO0227 | |
BLZ-2_long (barley), PRO0228 | |





Optionally, one or more terminator sequences may also be used in the construct introduced
into a plant. The term "terminator" encompasses a control sequence which is a DNA sequence
at the end of a transcriptional unit which signals 3' processing and polyadenylation of a primary
transcript and termination of transcription. Additional regulatory elements may include
transcriptional as well as translational enhancers. Those skilled in the art will be aware of
terminator and enhancer sequences, which may be suitable for use in performing the
invention. Such sequences would be known or may readily be obtained by a person skilled in
the art.

The genetic constructs of the invention may further include an origin of replication sequence 
which is required for maintenance and/or replication in a specific cell type. One example is 
when a genetic construct is required to be maintained in a bacterial cell as an episomal genetic 
element (e.g. plasmid or cosmid molecule). Preferred origins of replication include, but are not 
limited to, the f1-ori and colE1.

The genetic construct may optionally comprise a selectable marker gene. As used herein, the
term "selectable marker gene" includes any gene which confers a phenotype on a cell in which
it is expressed to facilitate the identification and/or selection of cells which are transfected or
transformed with a nucleic acid construct of the invention. Suitable markers may be selected
from markers that confer antibiotic or herbicide resistance, that introduce a new metabolic trait
or that allow visual selection. Examples of selectable marker genes include genes conferring
resistance to antibiotics (such as nptII that phosphorylates neomycin and kanamycin, or hpt,
phosphorylating hygromycin), to herbicides (for example bar which provides resistance to
Basta; aroA or gox providing resistance against glyphosate), or genes that provide a metabolic
trait (such as manA that allows plants to use mannose as sole carbon source). Visual marker
genes result in the formation of colour (for example β-glucuronidase, GUS), luminescence
(such as luciferase) or fluorescence (Green Fluorescent Protein, GFP, and derivatives
thereof).

In a preferred embodiment, the genetic construct comprises a prolamin promoter from rice 
operably linked to a seedyl nucleic acid in the sense orientation. An example of such an 
expression cassette, further comprising a terminator sequence, is as represented by SEQ ID 
NO 13. 

According to a further embodiment of the present invention, there is provided a method for the
production of a plant having modified growth characteristics, comprising modifying expression
and/or activity and/or levels in a plant of a seedyl nucleic acid or seedyl protein.
According to a particular embodiment, the present invention provides a method for the
production of a transgenic plant having modified growth characteristics, which method
comprises:

(i) introducing into a plant or plant part a seedyl nucleic acid encoding a seedyl
protein;

(ii) cultivating the plant cell under conditions promoting regeneration and mature
plant growth.

The nucleic acid of (i) may advantageously be any of the aforementioned seedyl nucleic
acids. 

The protein itself and/or the nucleic acid itself may be introduced directly into a plant cell or into
the plant itself (including introduction into a tissue, organ or any other part of the plant). 
According to a preferred feature of the present invention, the nucleic acid is preferably 
introduced into a plant by transformation. 

The term "transformation" as referred to herein encompasses the transfer of an exogenous
polynucleotide into a host cell, irrespective of the method used for transfer. Plant tissue
capable of subsequent clonal propagation, whether by organogenesis or embryogenesis, may
be transformed with a genetic construct of the present invention and a whole plant regenerated
therefrom. The particular tissue chosen will vary depending on the clonal propagation systems
available for, and best suited to, the particular species being transformed. Exemplary tissue
targets include leaf disks, pollen, embryos, cotyledons, hypocotyls, megagametophytes, callus
tissue, existing meristematic tissue (e.g., apical meristem, axillary buds, and root meristems),
and induced meristem tissue (e.g. cotyledon meristem and hypocotyl meristem). The
polynucleotide may be transiently or stably introduced into a host cell and may be maintained
non-integrated, for example, as a plasmid. Alternatively and preferably, the transgene may be
stably integrated into the host genome. The resulting transformed plant cell can then be used
to regenerate a transformed plant in a manner known to persons skilled in the art.

Transformation of a plant species is now a fairly routine technique. Advantageously, any of
several transformation methods may be used to introduce the gene of interest into a suitable
ancestor cell. Transformation methods include the use of liposomes, electroporation,
chemicals that increase free DNA uptake, injection of the DNA directly into the plant, particle
gun bombardment, transformation using viruses or pollen and microprojection. Methods may
be selected from the calcium/polyethylene glycol method for protoplasts (Krens, F.A. et al.,
1982, Nature 296, 72-74; Negrutiu I. et al., June 1987, Plant Mol. Biol. 8, 363-373);
electroporation of protoplasts (Shillito R.D. et al., 1985 Bio/Technol 3, 1099-1102);
microinjection into plant material (Crossway A. et al., 1986, Mol. Gen Genet 202, 179-185);
DNA or RNA-coated particle bombardment (Klein T.M. et al., 1987, Nature 327, 70); infection
with (non-integrative) viruses and the like.

Transgenic rice plants expressing a seedyl gene are preferably produced via Agrobacterium-
mediated transformation using any of the well known methods for rice transformation, such as
described in any of the following: published European patent application EP 1198985 A1,
Aldemita and Hodges (Planta, 199, 612-617, 1996); Chan et al. (Plant Mol. Biol. 22 (3) 491-
506, 1993), Hiei et al. (Plant J. 6 (2) 271-282, 1994), which disclosures are incorporated by
reference herein as if fully set forth. In the case of corn transformation, the preferred method is
as described in either Ishida et al. (Nat. Biotechnol. 1996 Jun; 14(6): 745-50) or Frame et al.
(Plant Physiol. 2002 May; 129(1): 13-22), which disclosures are incorporated by reference
herein as if fully set forth.

Generally after transformation, plant cells or cell groupings are selected for the presence of
one or more markers which are encoded by plant-expressible genes co-transferred with the
gene of interest, following which the transformed material is regenerated into a whole plant.
Following DNA transfer and regeneration, putatively transformed plants may be evaluated, for
instance using Southern analysis, for the presence of the gene of interest, copy number and/or
genomic organisation. Alternatively or additionally, expression levels of the newly introduced
DNA may be monitored using Northern and/or Western analysis, both techniques being well
known to persons having ordinary skill in the art.

The generated transformed plants may be propagated by a variety of means, such as by clonal 
propagation or classical breeding techniques. For example, a first generation (or T1) 
transformed plant may be selfed to give homozygous second generation (or T2) transformants, 
and the T2 plants further propagated through classical breeding techniques.
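
The proportion of homozygous plants expected after selfing may be illustrated by the following sketch. It assumes, purely for the illustration, that the selfed plant carries a single transgene copy at one locus in the hemizygous state and that the insert segregates in a Mendelian fashion; multiple or linked insertions would give different ratios. The function name and the genotype labels are hypothetical.

```python
from fractions import Fraction
from itertools import product


def selfing_genotype_ratios():
    """Expected progeny genotype fractions from selfing a hemizygous plant.

    Each gamete carries either the transgene ("T") or the empty locus ("-"),
    with equal probability, so the four gamete combinations are equally likely.
    """
    gametes = ["T", "-"]
    labels = {2: "homozygous", 1: "hemizygous", 0: "null"}
    counts = {"homozygous": 0, "hemizygous": 0, "null": 0}
    for pollen, egg in product(gametes, repeat=2):
        copies = (pollen == "T") + (egg == "T")
        counts[labels[copies]] += 1
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


if __name__ == "__main__":
    for genotype, fraction in selfing_genotype_ratios().items():
        print(genotype, fraction)
```

Under these assumptions one quarter of the progeny are expected to be homozygous for the transgene, one half hemizygous and one quarter to lack it.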

The generated transformed organisms may take a variety of forms. For example, they may be
chimeras of transformed cells and non-transformed cells; clonal transformants (e.g., all cells
transformed to contain the expression cassette); or grafts of transformed and untransformed
tissues (e.g., in plants, a transformed rootstock grafted to an untransformed scion).

The present invention also encompasses plants obtainable by the methods according to the 
present invention. The present invention therefore provides plants obtainable by the method 
according to the present invention, which plants have modified growth characteristics, when 
compared to wild-type plants.

The present invention clearly extends to any plant cell or plant produced by any of the methods 
described herein, and to all plant parts and propagules thereof. The present invention extends 
further to encompass the progeny of a primary transformed or transfected cell, tissue, organ or 
whole plant that has been produced by any of the aforementioned methods, the only
requirement being that progeny exhibit the same genotypic and/or phenotypic characteristic(s)
as those produced in the parent by the methods according to the invention, i.e. having modified
growth characteristics. 

The invention accordingly also includes host cells comprising an isolated seedy1 nucleic acid
as defined hereinabove. Preferred host cells according to the invention are plant cells or cells 
from insects, animals, yeast, fungi, algae or bacteria. The invention also extends to 
harvestable parts of a plant, such as but not limited to seeds, flowers, stamen, leaves, petals, 
fruits, stem, stem cultures, rhizomes, roots, tubers and bulbs. 

The term "plant" as used herein encompasses whole plants, ancestors and progeny of the
plants and plant parts, including seeds, shoots, stems, roots (including tubers), and plant cells,
tissues and organs. The term "plant" also therefore encompasses suspension cultures,
embryos, meristematic regions, callus tissue, leaves, gametophytes, sporophytes, pollen, and
microspores. Plants that are particularly useful in the methods of the invention include all
plants which belong to the superfamily Viridiplantae, in particular monocotyledonous and
dicotyledonous plants including a fodder or forage legume, ornamental plant, food crop, tree,
or shrub selected from the list comprising Acacia spp., Acer spp., Actinidia spp., Aesculus
spp., Agathis australis, Albizia amara, Alsophila tricolor, Andropogon spp., Arachis spp, Areca
catechu, Astelia fragrans, Astragalus cicer, Baikiaea plurijuga, Betula spp., Brassica spp.,
Bruguiera gymnorrhiza, Burkea africana, Butea frondosa, Cadaba farinosa, Calliandra spp,
Camellia sinensis, Canna indica, Capsicum spp., Cassia spp., Centroema pubescens,
Chaenomeles spp., Cinnamomum cassia, Coffea arabica, Colophospermum mopane,
Coronillia varia, Cotoneaster serotina, Crataegus spp., Cucumis spp., Cupressus spp.,
Cyathea dealbata, Cydonia oblonga, Cryptomeria japonica, Cymbopogon spp., Cynthea
dealbata, Cydonia oblonga, Dalbergia monetaria, Davallia divaricata, Desmodium spp.,
Dicksonia squarosa, Diheteropogon amplectens, Dioclea spp, Dolichos spp., Dorycnium
rectum, Echinochloa pyramidalis, Ehrartia spp., Eleusine coracana, Eragrostis spp., Erythrina
spp., Eucalyptus spp., Euclea schimperi, Eulalia villosa, Fagopyrum spp., Feijoa sellowiana,
Fragaria spp., Flemingia spp, Freycinetia banksii, Geranium thunbergii, Ginkgo biloba, Glycine
javanica, Gliricidia spp, Gossypium hirsutum, Grevillea spp., Guibourtia coleosperma,
Hedysarum spp., Hemarthia altissima, Heteropogon contortus, Hordeum vulgare, Hyparrhenia
rufa, Hypericum erectum, Hyperthelia dissoluta, Indigo incarnata, Iris spp., Leptarrhena
pyrolifolia, Lespediza spp., Lettuca spp., Leucaena leucocephala, Loudetia simplex, Lotonus
bainesii, Lotus spp., Macrotyloma axillare, Malus spp., Manihot esculenta, Medicago sativa,
Metasequoia glyptostroboides, Musa sapientum, Nicotianum spp., Onobrychis spp.,
Ornithopus spp., Oryza spp., Peltophorum africanum, Pennisetum spp., Persea gratissima,
Petunia spp., Phaseolus spp., Phoenix canariensis, Phormium cookianum, Photinia spp.,
Picea glauca, Pinus spp., Pisum sativum, Podocarpus totara, Pogonarthria fleckii,
Pogonarthria squarrosa, Populus spp., Prosopis cineraria, Pseudotsuga menziesii,
Pterolobium stellatum, Pyrus communis, Quercus spp., Rhaphiolepsis umbellata,
Rhopalostylis sapida, Rhus natalensis, Ribes grossularia, Ribes spp., Robinia pseudoacacia,
Rosa spp., Rubus spp., Salix spp., Schyzachyrium sanguineum, Sciadopitys verticillata,
Sequoia sempervirens, Sequoiadendron giganteum, Sorghum bicolor, Spinacia spp.,
Sporobolus fimbriatus, Stiburus alopecuroides, Stylosanthos humilis, Tadehagi spp, Taxodium
distichum, Themeda triandra, Trifolium spp., Triticum spp., Tsuga heterophylla, Vaccinium
spp., Vicia spp., Vitis vinifera, Watsonia pyramidata, Zantedeschia aethiopica, Zea mays,
amaranth, artichoke, asparagus, broccoli, Brussels sprouts, cabbage, canola, carrot,
cauliflower, celery, collard greens, flax, kale, lentil, oilseed rape, okra, onion, potato, rice,
soybean, strawberry, sugar beet, sugar cane, sunflower, tomato, squash, tea, trees.
Alternatively algae and other non-Viridiplantae can be used for the methods of the present 
invention. According to a preferred embodiment of the present invention, the plant is a crop 
plant such as soybean, sunflower, canola, alfalfa, rapeseed, cotton, tomato, potato or tobacco. 
Further preferably, the plant is a monocotyledonous plant such as sugar cane. More
preferably the plant is a cereal, such as rice, maize, wheat, barley, millet, rye, sorghum or oats.

Advantageously, the present invention provides a method for modifying growth characteristics 
of a plant, which modified growth characteristics are selected from any one or more of 
increased yield, increased biomass and modified plant architecture.

Further preferably, increased yield is increased seed yield. 

The term "increased yield" encompasses an increase in biomass in one or more harvestable
parts of a plant relative to the total biomass of corresponding wild-type plants. The term also
encompasses an increase in seed yield, which includes an increase in the biomass of the seed
(seed weight) and/or an increase in the number of (filled) seeds and/or in the size of the seeds
and/or an increase in seed volume, each relative to corresponding wild-type plants. An
increase in seed size and/or volume may also influence the composition of seeds. An increase
in seed yield could be due to an increase in the number and/or size of flowers. An increase in
yield might also increase the harvest index, which is expressed as a ratio of the yield of
harvestable parts, such as seeds, over the total biomass.
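
By way of illustration only, the harvest index may be computed as sketched below, taking the conventional agronomic definition (yield of harvestable parts divided by total biomass). The function name and the example weights are hypothetical and are not taken from the experiments described herein.

```python
def harvest_index(harvestable_yield_g: float, total_biomass_g: float) -> float:
    """Return the harvest index: harvestable yield divided by total biomass.

    Both arguments are dry weights in grams (an assumed unit); the
    harvestable yield cannot exceed the total biomass of the plant.
    """
    if total_biomass_g <= 0:
        raise ValueError("total biomass must be positive")
    if not 0 <= harvestable_yield_g <= total_biomass_g:
        raise ValueError("harvestable yield must lie between 0 and total biomass")
    return harvestable_yield_g / total_biomass_g


# Hypothetical plant: 30 g of seed out of 120 g of total biomass.
print(harvest_index(30.0, 120.0))
```

On this definition, a plant whose seed weight rises faster than its total biomass shows an increased harvest index.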

The methods of the present invention are used to increase the seed yield of the plant and are 
therefore particularly favourable to be applied to crop plants, preferably seed crops and
cereals. Therefore, the methods of the present invention are particularly useful for plants such
as rapeseed, sunflower, Leguminosae (e.g. soybean, pea, bean), flax, lupinus, canola and
cereals such as rice, maize, wheat, barley, millet, oats and rye.

Further preferably, increased biomass encompasses increased biomass of aboveground plant
tissue, herein determined as aboveground plant area. 

Additionally or alternatively, the plants according to the invention have increased aboveground 
area relative to corresponding wild type plants. 

Further preferably, said modified plant architecture encompasses increased number of
panicles and increased biomass relative to corresponding wild type plants.

The present invention also relates to use of a seedy1 nucleic acid and/or protein in
modifying plant growth characteristics.

According to another aspect of the present invention, the seedy1 nucleic acid and/or seedy1
protein may be used in breeding programmes. In an example of such a breeding programme,
a DNA marker is identified which may be genetically linked to a seedy1 nucleic acid/gene. This
DNA marker may then be used in breeding programmes to select plants having modified growth
characteristics relative to corresponding wild type plants.

The methods according to the present invention result in plants having modified growth
characteristics, as described hereinbefore. These advantageous characteristics may also be
combined with other economically advantageous traits, such as further yield-enhancing traits, 
tolerance to various stresses, traits modifying various architectural features and/or biochemical 
and/or physiological features. 

Description of the Figures 

The present invention will now be described with reference to the following Figures in which: 

Figure 1 is a schematic presentation of the entry clone, containing CDS0689 within the AttL1
and AttL2 sites for Gateway® cloning in the pDONR201 backbone. CDS0689 is the internal
code for the Nicotiana tabacum BY2 CDS0689 seedy1 coding sequence. This vector also
contains a bacterial kanamycin-resistance cassette and a bacterial origin of replication.

Figure 2 is a map of the binary vector for the expression in Oryza sativa of the Nicotiana
tabacum BY2 seedy1 gene (CDS0689) under the control of the rice prolamin RP6 promoter
(PRO0090). This vector contains a T-DNA derived from the Ti Plasmid, limited by a left border
(LB repeat, LB Ti C58) and a right border (RB repeat, RB Ti C58). From the left border to the
right border, this T-DNA contains: a selectable marker cassette for antibiotic selection of
transformed plants; a screenable marker cassette for visual screening of transformed plants;
the PRO0090-CDS0689-zein and rbcS-deltaGA double terminator cassette for expression of
the Nicotiana tabacum BY2 seedy1 gene (CDS0689). This vector also contains an origin of
replication from pBR322 for bacterial replication and a selectable marker (Spe/SmeR) for
bacterial selection with spectinomycin and streptomycin.

Figure 3 shows an N-terminal and C-terminal alignment of seedy1 amino acid sequences and
deduced amino acid sequences from ESTs, all from plants. This alignment was made with the
program AlignX of the VNTI software package. Motifs 1, 2, 3 and 4 are indicated with a bar.

Figure 4 is the representation of nucleic acids, protein and motif sequences according to the
invention.

Examples

The present invention will now be described with reference to the following examples, which 
are by way of illustration alone. 

Unless otherwise stated, recombinant DNA techniques were performed according to standard 
protocols described in Sambrook (2001) Molecular Cloning: a laboratory manual, 3rd Edition,
Cold Spring Harbor Laboratory Press, CSH, New York; or in Volumes 1 and 2 of Ausubel et al.
(1988), Current Protocols in Molecular Biology, Current Protocols. Standard materials and
methods for plant molecular work are described in Plant Molecular Biology Labfax (1993) by
R.D.D. Croy, published by BIOS Scientific Publications Ltd (UK) and Blackwell Scientific
Publications (UK).

Example 1: Cloning of the seedy1-encoding gene

A cDNA-AFLP experiment was performed on a synchronized tobacco BY2 cell culture
(Nicotiana tabacum L. cv. Bright Yellow-2), and BY2 expressed sequence tags that were cell
cycle modulated were identified and selected for further cloning. Subsequently, the expressed
sequence tags were used to screen a tobacco cDNA library and to isolate the full-length cDNA
of interest, namely the cDNA coding for the seedy1 protein of the present invention
(CDS0689).

Synchronization of BY2 cells.

Tobacco BY2 (Nicotiana tabacum L. cv. Bright Yellow-2) cultured cell suspension was
synchronized by blocking cells in early S-phase with aphidicolin as follows. Cultured cell
suspensions of Nicotiana tabacum L. cv. Bright Yellow-2 were maintained as described (Nagata
et al. Int. Rev. Cytol. 132, 1-30, 1992). For synchronization, a 7-day-old stationary culture was
diluted 10-fold in fresh medium supplemented with aphidicolin (Sigma-Aldrich, St. Louis, MO; 5
mg/l), a DNA polymerase α-inhibiting drug. After 24 h, cells were released from the block by
several washings with fresh medium and resumed their cell cycle progression.

RNA extraction and cDNA synthesis.

Total RNA was prepared by using LiCl precipitation (Sambrook et al., 2001) and poly(A+) RNA
was extracted from 500 µg of total RNA using Oligotex columns (Qiagen, Hilden, Germany)
according to the manufacturer's instructions. Starting from 1 µg of poly(A+) RNA, first-strand
cDNA was synthesized by reverse transcription with a biotinylated oligo-dT25 primer (Genset,
Paris, France) and SuperScript II (Life Technologies, Gaithersburg, MD). Second-strand
synthesis was done by strand displacement with Escherichia coli ligase (Life Technologies),
DNA polymerase I (USB, Cleveland, OH) and RNAse-H (USB).

cDNA-AFLP analysis. 

Five hundred ng of double-stranded cDNA was used for AFLP analysis as described (Vos et
al., Nucleic Acids Res. 23 (21) 4407-4414, 1995; Bachem et al., Plant J. 9 (5) 745-753, 1996).
The restriction enzymes used were BstYI and MseI (Biolabs) and the digestion was done in
two separate steps. After the first restriction digest with one of the enzymes, the 3' end
fragments were collected on Dynabeads (Dynal, Oslo, Norway) by means of their biotinylated
tail, while the other fragments were washed away. After digestion with the second enzyme, the
released restriction fragments were collected and used as templates in the subsequent AFLP
steps. For preamplifications, an MseI primer without selective nucleotides was combined with a
BstYI primer containing either a T or a C as 3' most nucleotide. PCR conditions were as
described (Vos et al., 1995). The obtained amplification mixtures were diluted 600-fold and 5
µl was used for selective amplifications using a 33P-labeled BstYI primer and the AmpliTaq-Gold
polymerase (Roche Diagnostics, Brussels, Belgium). Amplification products were
separated on 5% polyacrylamide gels using the Sequigel system (Biorad). Dried gels were
exposed to Kodak Biomax films as well as scanned in a PhosphorImager (Amersham
Pharmacia Biotech, Little Chalfont, UK).

Characterization of AFLP fragments. 

Bands corresponding to differentially expressed transcripts, among which the (partial)
transcript corresponding to CDS0689, were isolated from the gel and eluted DNA was
reamplified under the same conditions as for selective amplification. Sequence information
was obtained either by direct sequencing of the reamplified polymerase chain reaction product
with the selective BstYI primer or after cloning the fragments in pGEM-T easy (Promega,
Madison, WI) and sequencing of individual clones. The obtained sequences were compared
against nucleotide and protein sequences present in the publicly available databases by
BLAST sequence alignments (Altschul et al., Nucleic Acids Res. 25 (17) 3389-3402, 1997).
When available, tag sequences were replaced with longer EST or isolated cDNA sequences to
increase the chance of finding significant homology. The physical cDNA clone corresponding
to CDS0689 was subsequently amplified from a commercial tobacco cDNA library as follows.

Cloning of a tobacco CDS0689 seedy1 gene (CDS0689)

A cDNA library with average inserts of 1,400 bp was made with poly(A+) RNA isolated from actively
dividing, non-synchronized BY2 tobacco cells. These library inserts were cloned in the vector
pCMVSPORT6.0, comprising an attB Gateway cassette (Life Technologies). From this library
46,000 clones were selected, arrayed in 384-well microtiter plates, and subsequently spotted
in duplicate on nylon filters. The arrayed clones were screened by using pools of several
hundreds of radioactively labelled tags as probe (among which the BY2-tag corresponding to
the sequence CDS0689). Positive clones were isolated (among which the clone reacting with
the BY2-tag corresponding to the sequence CDS0689), sequenced, and aligned with the tag
sequence. Alternatively, when the hybridization with the tag failed, the full-length cDNA
corresponding to the tag was selected by PCR amplification as follows. Tag-specific primers
were designed using the primer3 program
(http://www-genome.wi.mit.edu/genome_software/other/primer3.html) and used in combination with the
common vector primer to amplify partial cDNA inserts. Pools of DNA from 50,000, 100,000,
150,000, and 300,000 cDNA clones were used as templates in the PCR amplifications.
Amplification products were isolated from agarose gels, cloned, sequenced and aligned with
tags. The vector comprising the sequence CDS0689 and obtained as described above was
referred to as the entry clone.

Example 2: Vector construction for transformation with PRO0090-CDS0689 cassette

The entry clone was subsequently used in a Gateway™ LR reaction with p0830, a destination 
vector used for Oryza sativa transformation. This vector contains as functional elements within 
the T-DNA borders: a plant selectable marker; a plant screenable marker; and a Gateway 
cassette intended for LR in vivo recombination with the sequence of interest already cloned in
the entry clone. The rice prolamin RP6 promoter for endosperm-specific expression 
(PRO0090) is located upstream of this Gateway cassette. 

After the LR recombination step, the resulting expression vector as shown in Fig. 2 was 
transformed into Agrobacterium and subsequently into Oryza sativa plants. Transformed rice
plants were allowed to grow and then examined for various parameters as described in
Example 3.

Example 3: Evaluation of transgenic rice plants transformed with prolamin::seedy1 (PRO0090-CDS0689) and results

Approximately 15 to 20 independent T0 rice transformants were generated. The primary
transformants were transferred from tissue culture chambers to a greenhouse for growing and
harvest of T1 seed. Four events of which the T1 progeny segregated 3:1 for presence/absence
of the transgene were retained. For each of these events, approximately 10 T1 seedlings
containing the transgene (hetero- and homozygotes), and approximately 10 T1 seedlings
lacking the transgene (nullizygotes), were selected by monitoring screenable marker
expression.
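
A 3:1 segregation of the kind relied on above can be checked with a chi-square goodness-of-fit test, as sketched below. This is an assumption for illustration: the test actually used to retain events is not stated herein, and the function names and seedling counts are hypothetical. The critical value 3.841 is the standard 5% threshold for one degree of freedom.

```python
CHI2_CRITICAL_5PCT_DF1 = 3.841  # chi-square critical value, 1 degree of freedom, 5% level


def chi_square_3_to_1(n_positive: int, n_negative: int) -> float:
    """Chi-square statistic for observed counts against a 3:1 expectation."""
    total = n_positive + n_negative
    if total == 0:
        raise ValueError("at least one seedling is required")
    expected_positive = 0.75 * total
    expected_negative = 0.25 * total
    return ((n_positive - expected_positive) ** 2 / expected_positive
            + (n_negative - expected_negative) ** 2 / expected_negative)


def consistent_with_3_to_1(n_positive: int, n_negative: int) -> bool:
    """True when the 3:1 hypothesis is not rejected at the 5% level."""
    return chi_square_3_to_1(n_positive, n_negative) < CHI2_CRITICAL_5PCT_DF1


# Hypothetical T1 counts (transgene present, transgene absent) for two events.
print(consistent_with_3_to_1(30, 10))  # exactly 3:1
print(consistent_with_3_to_1(20, 20))  # 1:1, far from 3:1
```

A 3:1 ratio in T1 is what is expected for a single transgene locus behaving as a dominant marker.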

Two events (60 plants per event, of which 30 were positive for the transgene and 30 negative)
having improved agronomical parameters in T1 were chosen for re-evaluation in T2. T1 and T2
plants were transferred to the greenhouse and evaluated for vegetative growth parameters and
seed parameters, as described below.

Statistical analysis: t-test and F-test 

A two factor ANOVA (analysis of variance) was used as statistical model for the overall
evaluation of plant phenotypic characteristics. An F-test was carried out on all the parameters
measured, for all of the plants of all of the events transformed with the gene of interest. The
F-test was carried out to check for an effect of the gene over all the transformation events and
to determine the overall effect of the gene or "global gene effect". Significant data, as
determined by the value of the F-test, indicate a "gene" effect, meaning that the phenotype
observed is caused by more than the mere presence or position of the gene. In the case of the
F-test, the threshold for significance for a global gene effect is set at a 5% probability level.
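
The F-test for a global gene effect may be sketched as follows. This is an illustrative sketch only, under stated assumptions: the two factors are taken to be the transformation event and the transgene status (positive or nullizygous), the design is balanced, and the model is additive without an interaction term. The function name and the measurements are hypothetical.

```python
from statistics import mean


def gene_effect_f(data):
    """Return (F, df_gene, df_error) for the gene factor of a two-factor ANOVA.

    `data` maps event -> {status -> list of measurements}. A balanced,
    additive (no interaction) design is assumed.
    """
    events = sorted(data)
    statuses = sorted(data[events[0]])
    values = [v for e in events for s in statuses for v in data[e][s]]
    grand_mean = mean(values)

    ss_total = sum((v - grand_mean) ** 2 for v in values)
    ss_gene = 0.0
    for s in statuses:
        group = [v for e in events for v in data[e][s]]
        ss_gene += len(group) * (mean(group) - grand_mean) ** 2
    ss_event = 0.0
    for e in events:
        group = [v for s in statuses for v in data[e][s]]
        ss_event += len(group) * (mean(group) - grand_mean) ** 2

    df_gene = len(statuses) - 1
    df_error = len(values) - len(events) - len(statuses) + 1
    ss_error = ss_total - ss_gene - ss_event
    f_value = (ss_gene / df_gene) / (ss_error / df_error)
    return f_value, df_gene, df_error


# Hypothetical measurements for two events, two plants per cell.
example = {
    "event_A": {"neg": [10.0, 12.0], "pos": [14.0, 16.0]},
    "event_B": {"neg": [11.0, 13.0], "pos": [15.0, 17.0]},
}
print(gene_effect_f(example))
```

The returned F value would then be compared with the F distribution for the two degrees of freedom; a gene effect is declared when the corresponding probability is below 5%.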

Vegetative growth measurements 

The selected transgenic plants were grown in a greenhouse. Each plant received a unique
barcode label to link unambiguously the phenotyping data to the corresponding plant. The
selected transgenic plants were grown on soil in 10 cm diameter pots under the following
environmental settings: photoperiod = 11.5 h, daylight intensity = 30,000 lux or more, daytime
temperature = 28°C or higher, night time temperature = 22°C, relative humidity = 60-70%.
Transgenic plants and the corresponding nullizygotes were grown side-by-side at random
positions. From the stage of sowing until the stage of maturity each plant was passed several
times through a digital imaging cabinet and imaged. At each time point digital images
(2048x1536 pixels, 16 million colours) were taken of each plant from at least 6 different angles.

The parameters described below were derived in an automated way from all the digital images
of all the plants, using image analysis software.



(a) Aboveground plant area 
Plant aboveground area was determined by counting the total number of pixels from
aboveground plant parts discriminated from the background. This value was averaged for the
pictures taken on the same time point from the different angles and was converted to a
physical surface value expressed in square mm by calibration. Experiments show that the
aboveground plant area measured this way correlates with the biomass of plant parts above
ground.
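
The averaging and calibration steps described above may be sketched as follows. The function name, the pixel counts and the calibration factor (square mm per pixel) are hypothetical; the image analysis software actually used is not specified herein.

```python
def aboveground_area_mm2(pixel_counts, mm2_per_pixel):
    """Average plant pixel counts over all angles and convert to square mm.

    `pixel_counts` holds one count of plant pixels per imaging angle for a
    single time point; `mm2_per_pixel` is the calibration factor.
    """
    if not pixel_counts:
        raise ValueError("at least one image is required")
    mean_pixels = sum(pixel_counts) / len(pixel_counts)
    return mean_pixels * mm2_per_pixel


# Hypothetical counts from 6 angles, with 0.25 square mm per pixel.
counts = [1000, 1200, 1100, 900, 1000, 800]
print(aboveground_area_mm2(counts, 0.25))
```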

(b) Number of primary panicles

The tallest panicle and all the panicles that overlap with the tallest panicle when aligned
vertically were counted manually, and considered as primary panicles.

Seed-related parameter measurements 

The mature primary panicles of T1 and T2 plants were harvested, bagged, barcode-labelled 
and then dried for three days in the oven at 37°C. The panicles were then threshed and all the 
seeds were collected and counted. The filled husks were separated from the empty ones 
using an air-blowing device. The empty husks were discarded and the remaining fraction was
counted again. The filled husks were weighed on an analytical balance. This procedure 
resulted in the set of seed-related parameters described below. 



(c) Number of filled seeds
The number of filled seeds was determined by counting the number of filled husks that
remained after the separation step. 



(d) Total seed yield per plant

The total seed yield was measured by weighing all filled husks harvested from a plant. 

The results show the % difference between positive plants and corresponding nullizygous
(negative) plants of a transgenic line. The values given in Tables 1 to 4 represent the average
for two T1 lines and the same two T2 lines.
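
The % difference reported in the Tables may be computed along the following lines. This is a sketch under the assumption that the difference is taken between the mean of the positive plants and the mean of the nullizygotes of each line, and then averaged over the lines; the function names and measurements are hypothetical.

```python
from statistics import mean


def percent_difference(positives, nullizygotes):
    """% difference of the positive-plant mean relative to the nullizygote mean."""
    mean_neg = mean(nullizygotes)
    if mean_neg == 0:
        raise ValueError("nullizygote mean must be non-zero")
    return 100.0 * (mean(positives) - mean_neg) / mean_neg


def average_percent_difference(lines):
    """Average the per-line % difference over (positives, nullizygotes) pairs."""
    return mean(percent_difference(pos, neg) for pos, neg in lines)


# Hypothetical measurements for two transgenic lines.
line_1 = ([150.0, 150.0], [100.0, 100.0])  # +50 %
line_2 = ([104.0], [100.0, 100.0])         # +4 %
print(average_percent_difference([line_1, line_2]))
```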



Table 1: overview of phenotypic data of seedy1 transgenic T1 and T2 plants for above
ground area

% difference between pos. and neg. plants for above ground area

             T1 plants    T2 plants
2 lines      +51%         +25.5%

Table 2: overview of phenotypic data of seedy1 transgenic T1 and T2 plants for number
of first panicles

% difference between pos. and neg. plants for nr. of first panicles

             T1 plants    T2 plants
2 lines      +101%        +26.5%

Table 3: overview of phenotypic data of seedy1 transgenic T1 and T2 plants for number
of filled seeds

% difference between pos. and neg. plants for nr. of filled seeds

             T1 plants    T2 plants
2 lines      +137%        +36.5%

Table 4: overview of phenotypic data of seedy1 transgenic T1 and T2 plants for total
seed weight per plant

% difference between pos. and neg. plants for total seed weight per plant

             T1 plants    T2 plants
2 lines      +152%        +47%